256 PART 5 Looking for Relationships with Correlation and Regression
would assume the chance of dying from radiation exposure may depend not only
on the radiation dose received, but also on age, gender, weight, general health,
radiation wavelength, and the amount of time over which the person was exposed
to radiation. In Chapter 17, we describe how the straight-line regression model
can be generalized to handle multiple predictors. You can generalize the logistic
formula to handle multiple predictors in the same way.
Suppose that the outcome variable Y is dependent on three predictors called X, V,
and W. Then the multivariate logistic model looks like this:
Y
e
a
bX
CV
dW
1
1
/
_
)
(
Logistic regression finds the best-fitting values of the parameters a, b, c, and d
given your data. That way, for any particular set of values for X, V, and W, you can
use the equation to predict Y, which is the probability of being positive for the
outcome.
Running a Logistic Regression
Model with Software
The theory behind logistic regression is difficult to grasp, and the calculations are
complicated (see the sidebar “Getting into the nitty-gritty of logistic regression”
for details). The good news is that most statistical software (as described in
Chapter 4) can run a logistic regression model, and it is similar to running a
straight-line or multiple linear regression model (see Chapters 16 and 17). Here
are the steps:
1.
Make sure your data set has a column for the outcome variable that is
coded as 1 where the individual is positive for the outcome, and 0 when
they are negative.
If you do not have an outcome column coded this way, use the data manage-
ment commands in your software to generate a new variable coded as 0 for
those who do not have the outcome, and 1 for those who have the outcome,
as shown in Table 18-1.
2.
Make sure your data set has a column for each predictor variable, and
that these columns are coded the way you want them to be entered
them into the model.